Tim Udelsman | 5/8/2017 | Prof. Csaba
I intend to examine the relationship between the CFTC's Commitment of Traders reports and future commodity price movements. If any predictive power is determined, I will then construct a simulated portfolio that trades over time with the market and compare that to historical market performance. There are a few caveats to take into account with this...
In [26]:
# Import necessary packages
# Import Quandl and FRED APIs with designated keys
import quandl
quandl.ApiConfig.api_key = '7D_fgezPtnQmwSEpAp9e'
from fredapi import Fred
fred = Fred(api_key='413d49541fa9fdb3f97029685315d5a5')
import pandas as pd
import numpy as np
# Import plotting packages. Despite its relative difficulty I will use plotly for its design and interactive features
from plotly.offline import iplot, iplot_mpl # plotting functions
import plotly.graph_objs as go # ditto
import cufflinks as cf # gives us df.iplot that feels like df.plot
import plotly.plotly as py
cf.set_config_file(offline=True, offline_show_link=False)
In [27]:
# Import the COT report on 'Light Sweet WTI Crude' and examine
WTI_COT = quandl.get('CFTC/CL_FO_ALL')
WTI_COT.head()
Out[27]:
In [28]:
# Also import the COT reports on 'Gold Comex - Futures and Options' for another dataset to look at later
Gold_COT = quandl.get('CFTC/GC_FO_ALL')
Gold_COT.head()
Out[28]:
In [29]:
# Clear up some unneeded columns
WTI_COT = WTI_COT.drop(['Swap Dealer Spreads',
'Money Manager Spreads',
'Other Reportable Spreads',
'Non Reportable Longs',
'Non Reportable Shorts'],
axis = 1)
In [30]:
Gold_COT = Gold_COT.drop(['Swap Dealer Spreads',
'Money Manager Spreads',
'Other Reportable Spreads',
'Non Reportable Longs',
'Non Reportable Shorts'],
axis = 1)
In [31]:
# Rename merged dataframe column titles for clarity - these are closer names to what these headings actually mean
col_list = list(WTI_COT.columns) # Collect column titles in a list for renaming
# This is easy as both dataframes have the same headings
col_list[1] = 'Commercial Longs'
col_list[2] = 'Commercial Shorts'
col_list[3] = 'Speculator Longs'
col_list[4] = 'Speculator Shorts'
col_list[7] = 'Public Longs'
col_list[8] = 'Public Shorts'
WTI_COT.columns = [col_list]
Gold_COT.columns = [col_list]
In [32]:
# We will use the FRED API
# Customize FRED reports using API to read weekly, ending Tuesday (This saves time later on)
WTI_Price = fred.get_series('DCOILWTICO',
observation_start = '2006-6-13',
observation_end = '2017-3-28',
frequency = 'wetu', # Weekly, ending Tuesday like COT reports
aggregation_method = 'eop')
WTI_Price = pd.DataFrame(WTI_Price)
WTI_Price = WTI_Price.iloc[::-1] # Need to save in descending order
Gold_Price = fred.get_series('GOLDAMGBD228NLBM',
observation_start = '2006-6-13',
observation_end = '2017-3-28',
frequency = 'wetu',
aggregation_method = 'eop')
Gold_Price = pd.DataFrame(Gold_Price)
Gold_Price = Gold_Price.iloc[::-1]
In [33]:
# Rename WTI_Price for clarity
WTI_Price.columns = ['WTI $/bbl']
WTI_Price.index.name = 'Date'
# Rename Gold_Price for clarity
Gold_Price.columns = ['Gold $/troy oz']
Gold_Price.index.name = 'Date'
In [34]:
# Check the shapes of each data set to ensure their lengths match for merging and correlation testing
Gold_Price.shape
Out[34]:
In [35]:
WTI_Price.shape
Out[35]:
In [36]:
# Merge the WTI and Gold price dataframes with their respective COT reports
# Be mindful of the date index as the COT report is ascending and price report descending in date
WTI_Merge = pd.merge(WTI_COT, WTI_Price,
how = 'inner', # Use the 'inner' setting because we only want info where the two datasets intersect
left_index = True,
right_index = True)
Gold_Merge = pd.merge(Gold_COT, Gold_Price,
how = 'inner',
left_index = True,
right_index = True)
In [37]:
# Plot historical prices of WTI and gold overlayed with open interest to look for pattern
WTI_Trace = go.Scatter(
x = WTI_Merge.index,
y = WTI_Merge['WTI $/bbl'],
name = 'WTI $/bbl')
WTI_Trace2 = go.Scatter(
x = WTI_Merge.index,
y = WTI_Merge['Open Interest'],
name = 'WTI Open Interest',
yaxis = 'y2')
data1 = [WTI_Trace, WTI_Trace2]
Gold_Trace = go.Scatter(
x = Gold_Merge.index,
y = Gold_Merge['Gold $/troy oz'],
name = 'Gold $/troy oz')
Gold_Trace2 = go.Scatter(
x = Gold_Merge.index,
y = Gold_Merge['Open Interest'],
name = 'Gold Open Interest',
yaxis = 'y2')
data2 = [Gold_Trace, Gold_Trace2]
layout = dict(
title = 'Price versus Open Interest',
width = 800,
height = 400,
yaxis = {'title':'Price ($)'},
yaxis2 = dict(
title='Open Interest',
overlaying='y',
side='right')
)
fig = go.Figure(data=data1, layout=layout)
iplot(fig)
fig = go.Figure(data=data2, layout=layout)
iplot(fig)
In [38]:
WTI_Merge.corrwith(WTI_Merge['WTI $/bbl'])
Out[38]:
In [39]:
Gold_Merge.corrwith(Gold_Merge['Gold $/troy oz'])
Out[39]:
In [40]:
print('WTI Shape:', WTI_Merge.shape)
print('Gold Shape:', Gold_Merge.shape)
In [41]:
week_list = [9,13,27,52] # Our time periods, less 4 weeks
correl_df1 = pd.DataFrame(WTI_Merge[4:555].corrwith(WTI_Merge['WTI $/bbl'][0:551]), columns = [4])
correl_df2 = pd.DataFrame(Gold_Merge[4:555].corrwith(Gold_Merge['Gold $/troy oz'][0:551]), columns = [4])
# Create the dataframes first with the four week correlations to capture row labels
for i in week_list:
correl_df1[i] = list(WTI_Merge[i:555].corrwith(WTI_Merge['WTI $/bbl'][0:555-i]))
correl_df2[i] = list(Gold_Merge[i:555].corrwith(Gold_Merge['Gold $/troy oz'][0:555-i]))
In [42]:
# Because we don't need the correlation of gold or oil with itself
correl_df1 = correl_df1.drop('WTI $/bbl', axis=0)
correl_df2 = correl_df2.drop('Gold $/troy oz', axis=0)
In [43]:
# Let's use a plotly heatmap on these two datasets to quickly find strong correlations
col_labels = ['1 month', '2 months', '3 months', '6 months', '12 months']
correl_variable = list(correl_df1.index) # For titling
df1_trace = [go.Heatmap(z = correl_df1.values.tolist(),
y = correl_variable,
x = col_labels,
colorscale = 'Viridis'
)]
df2_trace = [go.Heatmap(z = correl_df2.values.tolist(),
y = correl_variable,
x = col_labels,
colorscale = 'Viridis'
)]
layout3 = dict(title = 'WTI Correlation Heatmap',
width= 800,
height= 600,
yaxis = dict(tickfont = {'size':6}),
xaxis = {'title':'COT Correlation Lag Timing'})
layout4 = dict(title = 'Gold Correlation Heatmap',
width= 800,
height= 600,
yaxis = dict(tickfont = {'size':6}),
xaxis = {'title':'COT Correlation Lag Timing'})
fig = go.Figure(data=df1_trace, layout=layout3)
iplot(fig)
fig = go.Figure(data=df2_trace, layout=layout4)
iplot(fig)
In [44]:
# Tracking the correlations over the different time lags
bar_trace = [go.Bar(y = list(correl_df1.loc['Money Manager Shorts']),
x = col_labels,
opacity = 0.6,
marker=dict(
color=['rgba(204,204,204,1)', 'rgba(204,204,204,1)',
'rgba(222,45,38,0.8)', 'rgba(204,204,204,1)',
'rgba(204,204,204,1)'])
)]
layout5 = dict(title = 'Money Manager Shorts Correlations',
yaxis = dict(
title = 'Money Manager Shorts Correlations',
range = [-0.55,-0.65]),
)
fig = go.Figure(data=bar_trace, layout=layout5)
iplot(fig)
In [45]:
# Also, let's use some of the code above to chart the price of WTI versus money manager shorts
WTI_MM_Trace = go.Scatter(
x = WTI_Merge.index,
y = WTI_Merge['WTI $/bbl'],
name = 'WTI $/bbl')
WTI_MM_Trace2 = go.Scatter(
x = WTI_Merge.index,
y = WTI_Merge['Money Manager Shorts'],
name = 'MM Shorts',
yaxis = 'y2')
data3 = [WTI_MM_Trace, WTI_MM_Trace2]
layout6 = dict(
title = 'WTI Price versus Money Manager Shorts',
width = 800,
height = 400,
yaxis = {'title':'Price ($/bbl)'},
yaxis2 = dict(
title='Total Shorts',
overlaying='y',
side='right'),
showlegend = False,
shapes= [ {
'type': 'line',
'x0': '2014-08-10',
'y0': 0,
'x1': '2014-08-10',
'y1': 150,
'line': {
'color': 'red',
'width': 1
}},
{
'type': 'line',
'x0': '2006-12-23',
'y0': 0,
'x1': '2006-12-23',
'y1': 150,
'line': {
'color': 'red',
'width': 1
}},
{
'type': 'line',
'x0': '2010-05-20',
'y0': 0,
'x1': '2010-05-20',
'y1': 150,
'line': {
'color': 'red',
'width': 1
}}]
)
fig = go.Figure(data=data3, layout=layout6)
iplot(fig)
In [46]:
# First, set our sell and buy boundaries
Sell_Boundary = WTI_Merge['Money Manager Shorts'].quantile(q = 0.80)
Buy_Boundary = WTI_Merge['Money Manager Shorts'].quantile(q = 0.20)
print('Sell Boundary when short interest is greater than', Sell_Boundary)
print('Buy Boundary when short interest is less than', Buy_Boundary)
# Next, let's look at some descriptive statistics
WTI_Merge['Money Manager Shorts'].describe()
Out[46]:
In [47]:
# Now, we'll make two additional columns in our DataFrame for when a buy or sell signal occurs
counter = 0
WTI_Merge['Sell Signal'] = np.nan
WTI_Merge['Buy Signal'] = np.nan
for i in WTI_Merge['Money Manager Shorts']:
if i >= Sell_Boundary:
WTI_Merge['Sell Signal'][counter] = 1
WTI_Merge['Buy Signal'][counter] = 0
elif i <= Buy_Boundary:
WTI_Merge['Buy Signal'][counter] = 1
WTI_Merge['Sell Signal'][counter] = 0
else:
WTI_Merge['Sell Signal'][counter] = 0
WTI_Merge['Buy Signal'][counter] = 0
counter += 1
In [48]:
# We now have a list of buy and sell signals with 1 denoting True and 0 denoting False
Total_Sells = sum(WTI_Merge['Sell Signal'])
Total_Buys = sum(WTI_Merge['Buy Signal'])
print('The total number of sell orders are', Total_Sells)
print('The total number of buy orders are', Total_Buys)
# Let's also see the average sell signal price and average buy signal price
counter = 0
sell_sum = 0
buy_sum = 0
for i in WTI_Merge['WTI $/bbl']:
sell_sum += i * WTI_Merge['Sell Signal'][counter]
buy_sum += i * WTI_Merge['Buy Signal'][counter]
counter += 1
# No if statement is needed here because a non-signal is denoted by 0 and will zero out the buy or sell price
print('The average sell price is', sell_sum/Total_Sells)
print('The average buy price is', buy_sum/Total_Buys)
In [49]:
new_col_list = list(WTI_Merge.columns)
new_col_list[-1] = 'New Sell Signal'
new_col_list[-2] = 'New Buy Signal'
WTI_Merge.columns = new_col_list
In [50]:
# Our portfolio will either be completely long or completely short depending on the buy or sell signal
# This is easier said than done. We need to combine the buy and sell signals into one column
# Buys will be signaled by +1, sells by -1, no more 0's
WTI_Merge['Signal Column'] = WTI_Merge['New Buy Signal'] - WTI_Merge['New Sell Signal']
counter = len(WTI_Merge.index) - 1
for i in reversed(WTI_Merge['Signal Column']): # We must loop in reverse because the portfolio will build in an ascending manner
if i == 0:
WTI_Merge['Signal Column'][counter] = WTI_Merge['Signal Column'][counter+1] # Signal becomes the one below it
elif i == 1.0:
WTI_Merge['Signal Column'][counter] = 1.0
elif i == -1.0:
WTI_Merge['Signal Column'][counter] = -1.0
counter -= 1
In [51]:
# Create a simulated portfolio starting with $10,000 at the beginning of our data
# Portfolio is either 100% long or short as denoted by the single 'Signal Column'
WTI_Merge['Portfolio'] = np.nan
counter = len(WTI_Merge.index)-1
WTI_Merge['Portfolio'][555] = 10000 # Starting portfolio amount
for i in reversed(WTI_Merge['WTI $/bbl']):
if WTI_Merge['Signal Column'][counter] == 1.0: # If the previous signal was a buy, you are long the returns on WTI
WTI_Merge['Portfolio'][counter-1] = ((WTI_Merge['WTI $/bbl'][counter-1] - i)/(i) + 1) * WTI_Merge['Portfolio'][counter]
elif WTI_Merge['Signal Column'][counter] == -1.0: # If the previous signal was a sell, you are short returns
WTI_Merge['Portfolio'][counter-1] = (1 - (WTI_Merge['WTI $/bbl'][counter-1] - i)/(i)) * WTI_Merge['Portfolio'][counter]
counter -= 1 # Drop the counter by one each loop because we are iterating in reverse
WTI_Merge['Portfolio'][555] = 10000
WTI_Merge['Portfolio'].head()
Out[51]:
In [52]:
WTI_Merge.shape
Out[52]:
In [53]:
# Compare to the S&P 500 market portfolio
SP500 = quandl.get("YAHOO/INDEX_GSPC",start_date="2006-06-13",end_date="2017-03-28", collapse='weekly')
SP500['Returns'] = np.nan # Create a column of just percentage returns
counter = 0
SP500['Returns'][0] = 1
for i in SP500['Close']:
try: # To ignore the necessary 'index out of bounds' error that occurs from [counter+1]
SP500['Returns'][counter+1] = (((SP500['Close'][counter+1]-i)/i)+1) * SP500['Returns'][counter]
counter += 1
except:
''
In [54]:
WTI_Merge['Returns'] = np.nan # Create a column of just percentage returns for the Strategic Portfolio
counter = len(WTI_Merge.index)-1
WTI_Merge['Returns'][counter] = 1
for i in reversed(WTI_Merge['Portfolio']):
try: # To ignore the necessary 'index out of bounds' error that occurs from [counter+1]
WTI_Merge['Returns'][counter-1] = (((WTI_Merge['Portfolio'][counter-1]-i)/i)+1) * WTI_Merge['Returns'][counter]
counter -= 1
except:
''
In [55]:
# Plot the portfolio return versus the market return
WTI_Portfolio_Trace = go.Scatter(
x = WTI_Merge.index,
y = WTI_Merge['Returns'],
name = 'Strategic Portfolio Returns')
SP500_Trace2 = go.Scatter(
x = SP500.index,
y = SP500['Returns'],
name = 'Market Portfolio Returns',
)
data4 = [WTI_Portfolio_Trace, SP500_Trace2]
layout7 = dict(
title = 'Strategic versus Market Portfolio',
width = 1000,
height = 400,
yaxis = {'title':'Portfolio Return Multiple'},
showlegend = True)
fig = go.Figure(data=data4, layout=layout7)
iplot(fig)
In [56]:
# Let's take some comparative statistics of these two datasets now
WTI_Merge['Returns'].describe()
Out[56]:
In [57]:
SP500['Returns'].describe()
Out[57]:
In [58]:
# Finally, just comparing some last metrics about usefulness
Sharpe = (WTI_Merge['Returns'][0]-SP500['Returns'][0])/WTI_Merge['Returns'].describe()['std']
Max_Drawdown = WTI_Merge['Returns'].describe()['min']
print('The strategy Sharpe ratio is', Sharpe)
print('The strategy max drawdown is', Max_Drawdown)
With the caveats of "correlation does not equal causation" and "past results do not predict future returns" in mind, this trading strategy still produced some impressive returns compared to the market portfolio. The Sharpe Ratio measures excess returns from the market per unit of additional risk taken on. In this case, a Sharpe of 2.38 shows that the trading strategy provides an impressive level of return per unit of risk taken on compared to the S&P. Despite this extensive analysis, there are still more ways this strategy would need to be tested before being practical. A few factors that were left out...
It appears that the strategy was most productive in times where oil prices were steadily increasing (2007 to mid-2008) or steadily falling (2014-2015). By adding some sort of market volatility input, it's possible we could have developed an on/off switch for when the strategy would be active.
Overall, this is an interesting look into the COT reports which suggests that they may not have any broad predictive powers but they certainly can carry useful information.